corner case
Description of Corner Cases in Automated Driving: Goals and Challenges
Bogdoll, Daniel, Breitenstein, Jasmin, Heidecker, Florian, Bieshaar, Maarten, Sick, Bernhard, Fingscheidt, Tim, Zöllner, J. Marius
Scaling the distribution of automated vehicles requires handling various unexpected and possibly dangerous situations, termed corner cases (CC). Since many modules of automated driving systems are based on machine learning (ML), CC are an essential part of the data for their development. However, there is only a limited amount of CC data in large-scale data collections, which makes them challenging in the context of ML. A better understanding of CC can improve both offline applications, e.g., dataset analysis, and online methods, e.g., those that raise the performance of automated driving systems. While there are knowledge-based descriptions and taxonomies for CC, there is little research on machine-interpretable descriptions. In this extended abstract, we will give a brief overview of the challenges and goals of such a description.
- Europe > Germany > Baden-Württemberg > Karlsruhe Region > Karlsruhe (0.04)
- North America > United States > Tennessee > Shelby County > Memphis (0.04)
- North America > United States > Michigan > Wayne County > Detroit (0.04)
- Transportation > Ground > Road (1.00)
- Automobiles & Trucks (1.00)
- Information Technology > Robotics & Automation (0.95)
Addressing Corner Cases in Autonomous Driving: A World Model-based Approach with Mixture of Experts and LLMs
Liao, Haicheng, Wang, Bonan, Yang, Junxian, Wang, Chengyue, He, Zhengbin, Zhang, Guohui, Xu, Chengzhong, Li, Zhenning
Accurate and reliable motion forecasting is essential for the safe deployment of autonomous vehicles (AVs), particularly in rare but safety-critical scenarios known as corner cases. Existing models often underperform in these situations due to an over-representation of common scenes in training data and limited generalization capabilities. To address this limitation, we present WM-MoE, the first world model-based motion forecasting framework that unifies perception, temporal memory, and decision making to address the challenges of high-risk corner-case scenarios. The model constructs a compact scene representation that explains current observations, anticipates future dynamics, and evaluates the outcomes of potential actions. To enhance long-horizon reasoning, we leverage large language models (LLMs) and introduce a lightweight temporal tokenizer that maps agent trajectories and contextual cues into the LLM's feature space without additional training, enriching temporal context and commonsense priors. Furthermore, a mixture-of-experts (MoE) is introduced to decompose complex corner cases into subproblems and allocate capacity across scenario types, and a router assigns scenes to specialized experts that infer agent intent and perform counterfactual rollouts. In addition, we introduce nuScenes-corner, a new benchmark that comprises four real-world corner-case scenarios for rigorous evaluation. Extensive experiments on four benchmark datasets (nuScenes, NGSIM, HighD, and MoCAD) show that WM-MoE consistently outperforms state-of-the-art (SOTA) baselines and remains robust under corner-case and data-missing conditions, indicating the promise of world model-based architectures for robust and generalizable motion forecasting in fully autonomous vehicles.
- Asia > Macao (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
- Transportation > Ground > Road (1.00)
- Transportation > Infrastructure & Services (0.93)
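The routing idea in the WM-MoE abstract above (a router assigns each scene to specialized experts) can be illustrated with a generic top-k gating sketch. This is not the authors' implementation: the linear gate, the feature vectors, and the toy experts are all hypothetical and chosen only to show how a router might score experts and combine the outputs of the selected ones.

```python
import math
from typing import Callable, List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route(scene: Sequence[float], gate_weights: Sequence[Sequence[float]]) -> List[float]:
    """Score each expert with a linear gate (an assumption) and normalize."""
    scores = [sum(w * x for w, x in zip(row, scene)) for row in gate_weights]
    return softmax(scores)


def moe_forward(
    scene: Sequence[float],
    gate_weights: Sequence[Sequence[float]],
    experts: Sequence[Callable[[Sequence[float]], float]],
    top_k: int = 1,
) -> Tuple[float, List[int]]:
    """Run only the top_k experts and mix their outputs by renormalized gate weight."""
    probs = route(scene, gate_weights)
    chosen = sorted(range(len(experts)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in chosen)
    output = sum(probs[i] / norm * experts[i](scene) for i in chosen)
    return output, chosen


# Hypothetical toy setup: two experts, two-dimensional scene features.
toy_experts = [lambda s: 10.0, lambda s: -10.0]
toy_gate = [[2.0, 0.0], [0.0, 2.0]]
prediction, selected = moe_forward([1.0, 0.0], toy_gate, toy_experts, top_k=1)
```

With `top_k=1` each scene is handled by a single expert, which keeps compute constant as experts are added; a larger `top_k` blends several experts at higher cost.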
Challenges and opportunities in portraying emotion in generated sign language
McDonald, John C., Wolfe, Rosalee, Nunnari, Fabrizio
Non-manual signals in sign languages continue to be a challenge for signing avatars. More specifically, emotional content has been difficult to incorporate because of a lack of a standard method of specifying the avatar's emotional state. This paper explores the application of an intuitive two-parameter representation for emotive non-manual signals to the Paula signing avatar that shows promise for facilitating the linguistic specification of emotional facial expressions in a more coherent manner than previous methods. Users can apply these parameters to control Paula's emotional expressions through a textual representation called the EASIER notation. The representation can allow avatars to express more nuanced emotional states using two numerical parameters. It also has the potential to enable more consistent specification of emotional non-manual signals in linguistic annotations which drive signing avatars.
- North America > United States > Illinois > Cook County > Chicago (0.04)
- Europe > Netherlands > North Holland > Amsterdam (0.04)
- Europe > Greece > Attica > Athens (0.04)
SafeDriveRAG: Towards Safe Autonomous Driving with Knowledge Graph-based Retrieval-Augmented Generation
Ye, Hao, Qi, Mengshi, Liu, Zhaohong, Liu, Liang, Ma, Huadong
In this work, we study how vision-language models (VLMs) can be utilized to enhance the safety of autonomous driving systems, including perception, situational understanding, and path planning. However, existing research has largely overlooked the evaluation of these models in traffic safety-critical driving scenarios. To bridge this gap, we create a benchmark (SafeDrive228K) and propose a new baseline based on VLM with knowledge graph-based retrieval-augmented generation (SafeDriveRAG) for visual question answering (VQA). Specifically, we introduce SafeDrive228K, the first large-scale multimodal question-answering benchmark comprising 228K examples across 18 sub-tasks. This benchmark encompasses a diverse range of traffic safety queries, from traffic accidents and corner cases to common safety knowledge, enabling a thorough assessment of the comprehension and reasoning abilities of the models. Furthermore, we propose a plug-and-play multimodal knowledge graph-based retrieval-augmented generation approach that employs a novel multi-scale subgraph retrieval algorithm for efficient information retrieval. By incorporating traffic safety guidelines collected from the Internet, this framework further enhances the model's capacity to handle safety-critical situations. Finally, we conduct comprehensive evaluations on five mainstream VLMs to assess their reliability in safety-sensitive driving tasks. Experimental results demonstrate that integrating RAG significantly improves performance, achieving a +4.73% gain in Traffic Accidents tasks, +8.79% in Corner Cases tasks and +14.57% in Traffic Safety Commonsense across five mainstream VLMs, underscoring the potential of our proposed benchmark and methodology for advancing research in traffic safety. Our source code and data are available at https://github.com/Lumos0507/SafeDriveRAG.
- Asia > China > Beijing > Beijing (0.06)
- Europe > Ireland > Leinster > County Dublin > Dublin (0.05)
- North America > United States > New York > New York County > New York City (0.04)
- Transportation > Ground > Road (1.00)
- Information Technology (1.00)
- Automobiles & Trucks (1.00)
Red-Team Multi-Agent Reinforcement Learning for Emergency Braking Scenario
Chen, Yinsong, Wang, Kaifeng, Meng, Xiaoqiang, Li, Xueyuan, Li, Zirui, Gao, Xin
Current research on decision-making in safety-critical scenarios often relies on inefficient data-driven scenario generation or specific modeling approaches, which fail to capture corner cases in real-world contexts. To address this issue, we propose a Red-Team Multi-Agent Reinforcement Learning framework, where background vehicles with interference capabilities are treated as red-team agents. Through active interference and exploration, red-team vehicles can uncover corner cases outside the data distribution. The framework uses a Constraint Graph Representation Markov Decision Process, ensuring that red-team vehicles comply with safety rules while continuously disrupting the autonomous vehicles (AVs). A policy threat zone model is constructed to quantify the threat posed by red-team vehicles to AVs, inducing more extreme actions to increase the danger level of the scenario. Experimental results show that the proposed framework significantly impacts AVs' decision-making safety and generates various corner cases. This method also offers a novel direction for research in safety-critical scenarios.
- Leisure & Entertainment > Games (0.91)
- Transportation (0.70)
ECCV 2024 W-CODA: 1st Workshop on Multimodal Perception and Comprehension of Corner Cases in Autonomous Driving
Chen, Kai, Gao, Ruiyuan, Hong, Lanqing, Xu, Hang, Jia, Xu, Caesar, Holger, Dai, Dengxin, Liu, Bingbing, Tsishkou, Dzmitry, Xu, Songcen, Xu, Chunjing, Xu, Qiang, Lu, Huchuan, Yeung, Dit-Yan
In this paper, we present details of the 1st W-CODA workshop, held in conjunction with ECCV 2024. W-CODA aims to explore next-generation solutions for autonomous driving corner cases, empowered by state-of-the-art multimodal perception and comprehension techniques. Five speakers from both academia and industry are invited to share their latest progress and opinions. We collect research papers and hold a dual-track challenge, including both corner case scene understanding and generation. As the pioneering effort, we will continuously bridge the gap between frontier autonomous driving techniques and fully intelligent, reliable self-driving agents robust towards corner cases.
- Personal (0.69)
- Instructional Material > Course Syllabus & Notes (0.46)
- Transportation > Ground > Road (1.00)
- Information Technology > Robotics & Automation (1.00)
- Automobiles & Trucks (1.00)
VLM-C4L: Continual Core Dataset Learning with Corner Case Optimization via Vision-Language Models for Autonomous Driving
Hu, Haibo, Zuo, Jiacheng, Lou, Yang, Cui, Yufei, Wang, Jianping, Guan, Nan, Wang, Jin, Li, Yung-Hui, Xue, Chun Jason
With the widespread adoption and deployment of autonomous driving, handling complex environments has become an unavoidable challenge. Due to the scarcity and diversity of extreme scenario datasets, current autonomous driving models struggle to effectively manage corner cases. This limitation poses a significant safety risk: according to the National Highway Traffic Safety Administration (NHTSA), autonomous vehicle systems have been involved in hundreds of reported crashes annually in the United States; these occurred in corner cases such as sun glare and fog and caused a few fatal accidents. Furthermore, in order to consistently maintain a robust and reliable autonomous driving system, it is essential for models not only to perform well on routine scenarios but also to adapt to newly emerging scenarios, especially those corner cases that deviate from the norm. This requires a learning mechanism that incrementally integrates new knowledge without degrading previously acquired capabilities. However, to the best of our knowledge, no existing continual learning methods have been proposed to ensure consistent and scalable corner case learning in autonomous driving. To address these limitations, we propose VLM-C4L, a continual learning framework that introduces Vision-Language Models (VLMs) to dynamically optimize and enhance corner case datasets. VLM-C4L combines VLM-guided high-quality data extraction with a core data replay strategy, enabling the model to incrementally learn from diverse corner cases while preserving performance on previously learned routine scenarios, thus ensuring long-term stability and adaptability in real-world autonomous driving. We evaluate VLM-C4L on large-scale real-world autonomous driving datasets, including Waymo and the corner case dataset CODA.
- North America > United States (0.68)
- North America > Canada > Quebec > Montreal (0.04)
- Asia > China > Hong Kong (0.04)
- Transportation > Ground > Road (1.00)
- Automobiles & Trucks (1.00)
- Government > Regional Government > North America Government > United States Government (0.34)
Hyperdimensional Uncertainty Quantification for Multimodal Uncertainty Fusion in Autonomous Vehicles Perception
Chen, Luke, Wang, Junyao, Mortlock, Trier, Khargonekar, Pramod, Faruque, Mohammad Abdullah Al
Uncertainty Quantification (UQ) is crucial for ensuring the reliability of machine learning models deployed in real-world autonomous systems. However, existing approaches typically quantify task-level output prediction uncertainty without considering epistemic uncertainty at the multimodal feature fusion level, leading to sub-optimal outcomes. Additionally, popular uncertainty quantification methods, e.g., Bayesian approximations, remain challenging to deploy in practice due to high computational costs in training and inference. In this paper, we propose HyperDUM, a novel deterministic uncertainty method (DUM) that efficiently quantifies feature-level epistemic uncertainty by leveraging hyperdimensional computing. Our method captures the channel and spatial uncertainties through channel-wise and patch-wise projection and bundling techniques, respectively. Multimodal sensor features are then adaptively weighted to mitigate uncertainty propagation and improve feature fusion. Our evaluations show that HyperDUM on average outperforms the state-of-the-art (SOTA) algorithms by up to 2.01%/1.27% in 3D object detection and by up to 1.29% over baselines in semantic segmentation tasks under various types of uncertainties. Notably, HyperDUM requires 2.36x fewer floating point operations and up to 38.30x fewer parameters than SOTA methods, providing an efficient solution for real-world autonomous systems.
- Europe (0.04)
- North America > United States > California > Orange County > Irvine (0.04)
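For readers unfamiliar with the "projection and bundling" vocabulary in the HyperDUM abstract above, the following is a textbook-style hyperdimensional computing sketch, not the HyperDUM method itself. It assumes a random bipolar projection, bundling by element-wise summation, and a hypothetical uncertainty score derived from cosine similarity to the bundled prototype; every name and the score definition are illustrative assumptions.

```python
import math
import random
from typing import List, Sequence


def random_projection(in_dim: int, hd_dim: int, rng: random.Random) -> List[List[float]]:
    """Random bipolar (+1/-1) matrix mapping features to a high-dimensional space."""
    return [[rng.choice((-1.0, 1.0)) for _ in range(in_dim)] for _ in range(hd_dim)]


def encode(features: Sequence[float], projection: Sequence[Sequence[float]]) -> List[float]:
    """Project the features and keep only the sign, giving a bipolar hypervector."""
    return [
        1.0 if sum(w * x for w, x in zip(row, features)) >= 0 else -1.0
        for row in projection
    ]


def bundle(hypervectors: Sequence[Sequence[float]]) -> List[float]:
    """Bundle hypervectors by element-wise summation into one prototype."""
    return [sum(column) for column in zip(*hypervectors)]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def uncertainty(features: Sequence[float], prototype: Sequence[float],
                projection: Sequence[Sequence[float]]) -> float:
    """Hypothetical score in [0, 1]: low when the input resembles the prototype."""
    return (1.0 - cosine(encode(features, projection), prototype)) / 2.0


rng = random.Random(0)
projection = random_projection(in_dim=3, hd_dim=256, rng=rng)
seen = [[1.0, 0.5, 0.25], [1.0, 0.5, 0.3]]  # made-up "training" features
prototype = bundle([encode(f, projection) for f in seen])
```

Inputs that resemble the bundled training features score near 0, while dissimilar inputs score closer to 1, which is the intuition behind using similarity to a prototype as a cheap, deterministic familiarity signal.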
DriveGen: Towards Infinite Diverse Traffic Scenarios with Large Models
Zhang, Shenyu, Tian, Jiaguo, Zhu, Zhengbang, Huang, Shan, Yang, Jucheng, Zhang, Weinan
Microscopic traffic simulation has become an important tool for autonomous driving training and testing. Although recent data-driven approaches advance realistic behavior generation, their learning still relies primarily on a single real-world dataset, which limits their diversity and thereby hinders downstream algorithm optimization. In this paper, we propose DriveGen, a novel traffic simulation framework with large models for more diverse traffic generation that supports further customized designs. DriveGen consists of two internal stages: the initialization stage uses a large language model and a retrieval technique to generate map and vehicle assets; the rollout stage outputs trajectories with selected waypoint goals from a visual language model and a specifically designed diffusion planner. Through this two-stage process, DriveGen fully utilizes large models' high-level cognition and reasoning of driving behavior, obtaining greater diversity beyond datasets while maintaining high realism. To support effective downstream optimization, we additionally develop DriveGen-CS, an automatic corner case generation pipeline that uses failures of the driving algorithm as additional prompt knowledge for large models without the need for retraining or fine-tuning. Experiments show that our generated scenarios and corner cases achieve superior performance compared to state-of-the-art baselines. Downstream experiments further verify that the synthesized traffic of DriveGen provides better optimization of the performance of typical driving algorithms, demonstrating the effectiveness of our framework.
- Asia > China > Shanghai > Shanghai (0.04)
- Asia > China > Chongqing Province > Chongqing (0.04)
- North America > United States > California (0.04)
- Transportation > Ground > Road (1.00)
- Automobiles & Trucks (1.00)
- Information Technology (0.90)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
- Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.92)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.90)
Physics-model-guided Worst-case Sampling for Safe Reinforcement Learning
Cao, Hongpeng, Mao, Yanbing, Sha, Lui, Caccamo, Marco
Real-world accidents in learning-enabled cyber-physical systems (CPS) frequently occur in challenging corner cases. During the training of a deep reinforcement learning (DRL) policy, the standard setup for training conditions is either fixed at a single initial condition or uniformly sampled from the admissible state space. This setup often overlooks the challenging but safety-critical corner cases. To bridge this gap, this paper proposes a physics-model-guided worst-case sampling strategy for training safe policies that can handle safety-critical cases toward guaranteed safety. Furthermore, we integrate the proposed worst-case sampling strategy into the physics-regulated deep reinforcement learning (Phy-DRL) framework to build a more data-efficient and safe learning algorithm for safety-critical CPS. We validate the proposed training strategy with Phy-DRL through extensive experiments on a simulated cart-pole system, a 2D quadrotor, and both a simulated and a real quadruped robot, showing remarkably improved sampling efficiency to learn more robust safe policies.
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Illinois > Champaign County > Urbana (0.04)
- North America > United States > California > San Francisco County > San Francisco (0.04)
- Transportation > Ground > Road (1.00)
- Automobiles & Trucks (0.93)
- Information Technology > Robotics & Automation (0.68)
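The contrast drawn in the worst-case sampling abstract above (uniform or fixed initial conditions versus sampling that targets safety-critical cases) can be sketched as follows. The sketch assumes, purely for illustration, that the safe region is an axis-aligned ellipsoid {s : sum_i p_i * s_i^2 <= 1} and that "worst case" means initial states close to its boundary; the abstract does not state how the authors define or compute worst-case conditions, and all names and numbers here are hypothetical.

```python
import math
import random
from typing import List, Sequence


def safety_level(state: Sequence[float], p_diag: Sequence[float]) -> float:
    """Quadratic form sum_i p_i * s_i^2; values <= 1 lie inside the assumed safe set."""
    return sum(p * s * s for p, s in zip(p_diag, state))


def sample_near_boundary(
    p_diag: Sequence[float], rng: random.Random, min_level: float = 0.9
) -> List[float]:
    """Draw an initial state whose safety level lies in [min_level, 1].

    A random direction is scaled so the state sits just inside the boundary,
    instead of being spread uniformly over the whole admissible set.
    """
    direction = [rng.gauss(0.0, 1.0) for _ in p_diag]
    level = rng.uniform(min_level, 1.0)
    scale = math.sqrt(level / safety_level(direction, p_diag))
    return [scale * d for d in direction]


# Hypothetical two-dimensional example (e.g. an angle and an angular velocity).
p_diag = [4.0, 1.0]
rng = random.Random(0)
initial_states = [sample_near_boundary(p_diag, rng) for _ in range(5)]
```

Training episodes that start from such states spend more time near the edge of the safe region, which is where a uniformly sampled setup rarely lands.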